ABSTRACT

Recognition of the 3'-splice site is a key step in pre-mRNA
splicing and is accomplished by a dynamic
complex comprising splicing factor 1 (SF1) and the 
U2 snRNP auxiliary factor 65-kDa subunit (U2AF65). 
Both proteins mediate protein-protein and protein- 
RNA interactions for cooperative RNA-binding 
during spliceosome assembly. Here, we report the 
solution structure of a novel helix-hairpin domain in 
the N-terminal region of SF1 (SF1 NTD ). The nuclear 
magnetic resonance- and small-angle X-ray 
scattering-derived structure of a complex of the 
SF1 NTD with the C-terminal U2AF homology motif 
domain of U2AF65 (U2AF65 UHM ) reveals that, in 
addition to the known U2AF65 UHM -SF1 interaction, 
the helix-hairpin domain forms a secondary, hydro- 
phobic interface with U2AF65 UHM , which locks the 
orientation of the two subunits. Mutational analysis 
shows that the helix hairpin is essential for coopera- 
tive formation of the ternary SF1-U2AF65-RNA 
complex. We further show that tandem serine phos- 
phorylation of a conserved Ser80-Pro81-Ser82- 
Pro83 motif rigidifies a long unstructured linker 
in the SF1 helix hairpin. Phosphorylation does 
not significantly alter the overall conformations 
of SF1, SF1-U2AF65 or the SF1-U2AF65-RNA
complexes, but slightly enhances RNA binding.
Our results indicate that the helix-hairpin domain
of SF1 is required for cooperative 3'-splice site
recognition presumably by stabilizing a unique 
quaternary arrangement of the SF1-U2AF65-RNA 
complex. 

INTRODUCTION 

Removal of non-coding sequences (introns) from 
pre-mRNAs is a key step in mammalian gene expression 
performed by the spliceosome, a large and dynamic 
ribonucleoprotein particle. Alternative splicing is essential 
to generate different proteins from a given primary tran- 
script by differential inclusion or exclusion of coding 
sequences (exons) (1). Rather weakly conserved RNA 
sequence motifs located at the 5'- and 3'-ends of an
intron are first recognized by splicing factors (SFs) in 
the early complex E during the assembly of the 
spliceosome, which is then converted into complexes A, 
B and finally into complex C where splicing catalysis 
occurs (2). Given the importance of splicing for gene ex- 
pression, numerous diseases have been linked to aberrant 
splicing and mutations within the consensus sequences at 
the 5'- and 3'-ends of introns interfere with spliceosome
assembly (3). Recently, frequent missense mutations in 
genes of SFs that mediate 3'-splice site recognition, such
as SF1 and the U2 snRNP auxiliary factor (U2AF) large
(U2AF65) and small (U2AF35) subunits, have been linked
to tumourigenesis (4). 

The regulation of splicing by extracellular signals and 
signal transduction cascades is poorly understood. With 
respect to 3'-splice site recognition, phosphorylation of
conserved serine residues (Ser20, Ser80 and Ser82) in 
SF1 has been implicated in regulating the stability of the 
ternary SF1-U2AF65-RNA complex in vitro (5,6). 
However, both the structural and functional roles of 
SF1 phosphorylation remain elusive. 

Early recognition of the intron/exon junctions is a key 
regulatory step in splicing. In the initial complex E, the U1
snRNP binds to a short stretch of 6 nt at the 5'-splice site,
whereas the 3'-splice site is defined by the binding of SF1 to 
the branch point sequence (BPS), of U2AF65 to the 
poly-pyrimidine tract (Py tract), and of U2AF35 to the 
AG dinucleotide defining the 3'-splice site, with the add- 
itional involvement of proof-reading factors (7,8) 
(Figure 1A). The structure of the U1 snRNP involved in
5'-splice site recognition has been reported recently (9).
Structural information about 3'-splice site recognition is 
so far limited to binary protein-protein and protein-RNA 
complexes (10-14), involving SF1, U2AF and RNA. 
However, given the dynamic nature of protein-RNA inter- 
actions that mediate 3'-splice site recognition (12), its 
complete structural analysis should include solution 
approaches to detect and characterize functionally 
relevant conformational dynamics (15). Details of intron 
RNA recognition are available based on 3D structures of 
the RNA-binding regions of SF1 (comprising the KH and 
QUA2 domains, SF1 KHQUA2 ) bound to BPS RNA (10), 
and of the tandem U2AF65 RRM1-RRM2 domains 
(U2AF65 RRM1,2 ) with Py tract RNA (12). Structural
details of a protein-protein interaction between U2AF65 
and SF1 have been reported (11). These involve the non- 
canonical U2AF65 RRM3, a founding member of the 
U2AF homology motif domain (U2AF65 UHM ) (16) and a 
tryptophan-containing peptide sequence, called UHM 
ligand motif (17), in SF1 (SF1 ULM ) (11) (Figure 1B).

Here, we present structural and biochemical analyses 
of novel interactions in the SF1-U2AF65 complex in 
solution by combining nuclear magnetic resonance 
(NMR) spectroscopy and small-angle X-ray scattering 
(SAXS). We show that the N-terminus of SF1 (SF1 NTD ) 
comprises a helix hairpin fold (SF1 HH ) providing an add- 
itional binding interface to U2AF65 UHM . This interface 
involves mainly hydrophobic interactions between the 
SF1 NTD helical hairpin and U2AF65 UHM . SF1 NTD is
essential for cooperative formation of the ternary SF1-
U2AF65-RNA complex. To assess the effects of the
recently reported tandem serine phosphorylation of a 
Ser80-Pro81-Ser82-Pro83 (SPSP) motif in SF1 NTD , we 
performed structural and functional studies of phos- 
phorylated SF1 alone and in complex with U2AF65. 
These studies show that phosphorylation does not notice- 
ably affect the conformation of SF1 or SF1-U2AF65 and 
does not modulate cooperative RNA binding, thus sug- 
gesting additional roles for SF1 phosphorylation. Our 
results represent a significant step towards solving the 
structure of the ternary SF1-U2AF65-RNA complex
and thus understanding the molecular basis of 3'-splice
site recognition.

MATERIALS AND METHODS 

Protein expression and NMR sample preparation 

Plasmids for expressing human U2AF65 UHM (residues 
372-475), U2AF65 RRM123 (residues 147-475), as well as
human SF1 ULM (residues 1-25), SF1 HH (residues 27-145),
SF1 NTD (residues 1-145) and SF1 (residues 1-260) were
prepared in a modified pET-M11 vector containing an
N-terminal His6 tag followed by a tobacco etch virus
(TEV) protease cleavage site. SF1 deletions of helix α1
(Δ35-68), α2 (Δ96-128), the complete helix hairpin
(Δ35-128) or the α1-α2 linker (Δ75-90) were generated
by polymerase chain reaction (PCR) of SF1 2-320 in
pTRCHis A (20) with reverse and forward primers
complementary to sequences upstream and downstream,
respectively, of the desired deletion. The primers
contained KpnI restriction sites and PCR products were
ligated after KpnI digestion. The deleted protein
sequences were thus replaced by Gly-Thr. Full-length 
KIS kinase (comprising the kinase domain and a UHM 
domain) was cloned by PCR ligation of the DNA 
encoding the short KIS isoform (residues 1-344) and the 
KIS UHM (residues 315-419). All plasmids were verified 
by sequencing. 

Expression plasmids were transformed into Escherichia 
coli BL21(DE3) cells, grown in standard LB medium or in 
minimal M9T medium supplemented with 2 g/l [U-13C]-glucose
and/or 1 g/l [15N]-ammonium chloride as the sole
carbon and nitrogen sources. SF1 (residues 1-260) and 
U2AF65 RRM123 were prepared as [U-2H,15N]-labelled
proteins for NMR titrations and relaxation measurements
or as [U-2H,13C,15N]-deuterated samples with methyl
protonation of isoleucine, leucine and valine residues as 
described (21) for chemical shift assignments. Protein
synthesis was induced by the addition of 0.5 mM isopropyl
β-D-1-thiogalactopyranoside (IPTG) at OD600 ~0.8. After
protein expression for 16 h at 25°C, cells were collected by
centrifugation, lysed by sonication in the presence of 
lysozyme and ethylenediaminetetraacetic acid (EDTA)- 
free 'complete protease inhibitor' (Roche Applied 
Science), then re-suspended in binding buffer consisting 
of 50 mM Tris (pH 8.0), 500 mM NaCl, 5% (v/v) 
glycerol and 5 mM imidazole. The sample was loaded
onto Ni-NTA chromatography resin (Qiagen) and
washed with 20 column volumes of binding buffer
followed by five column volumes of the same buffer but
with 25 mM imidazole, and then the sample was eluted
with the buffer containing 250 mM imidazole. The His6-tag
was cleaved by incubating samples with 0.1 mg TEV
protease/mg protein sample and 2 mM DTT at 4°C for
12 h. The TEV protease, the histidine-tag and uncleaved
protein were removed by a second passage of the sample
through Ni-NTA resin. The eluate was further purified by
gel filtration on a Superdex 75 column (GE) using sodium
phosphate (pH 6.5), 50 mM NaCl, 1 mM dithiothreitol
(DTT) as buffer. NMR samples were concentrated from
100 to 600 µM in NMR buffer consisting of 20 mM
sodium phosphate (pH 6.5), 50 mM NaCl, 0.1% sodium
azide, 1 mM DTT and 1 mM EDTA.

Figure 1. Proteins and domains involved in 3'-splice site recognition. (A) Diagram of the SF1-U2AF65 3'-splice site recognition complex
representing domains of SF1 and U2AF65 (HH, helix hairpin; KH, hnRNP K homology; QUA2, quaking homology-2; RRM, RNA recognition motif;
ULM, U2AF homology domain (UHM) ligand motif). SF1 and U2AF65 are coloured blue and green, respectively, with the same colour code for
the proteins being used in the NMR spectra throughout the article. (B) Diagram of the protein constructs used. (C) Sequence alignment of SF1 HH
domains. Sequences were taken from the UniProt database (www.uniprot.org) and residue numbers are given for Homo sapiens SF1. The secondary
structure of SF1 HH is depicted above the alignment. The phosphorylated serine residues (Ser80 and Ser82) are indicated in red. Sequences were
aligned with ClustalW (18) and analysed with Jalview 2 (19).

KIS kinase was expressed in standard LB medium fol- 
lowing the same protocol. The protein was purified at 4°C 
without cleavage and removal of the His6-tag. Aliquots
of the recombinant KIS kinase were stored at -80°C
in buffer consisting of 50 mM 2-(N-morpholino)ethanesulfonic
acid (MES) (pH 8.0), 15 mM MgCl2, 25% (v/v)
glycerol, 2 mM DTT and 1 mM EDTA.

His6-tagged SF1 2-320 and deletion mutants for gel shift
experiments were expressed in E. coli and purified as 
described (20). Proteins were dialyzed against 20 mM 
Hepes-KOH (pH 7.9), 100 mM KCl, 20% (v/v) glycerol,
0.2 mM EDTA and 0.5 mM DTT. 

A 20-mer RNA containing the BPS and the Py tract 
(5'-UAUACUAACAAUUUUUUUUU-3') was purchased
from BioSpring GmbH, dissolved in H2O to a final
concentration of 10 mM and added to the protein samples
before the NMR and SAXS measurements. 

In vitro phosphorylation 

Phosphorylation of SF1 NTD and SF1 for NMR analysis 
was performed in 10 ml kinase buffer containing 50 mM 
MES (pH 8.0), 15 mM MgCl2, 25% (v/v) glycerol, 5 mM
DTT, 1 mM EDTA, 0.5 ml of 100 mM adenosine triphosphate
(ATP) stock solution, 1 ml 10× PhosSTOP (Roche),
40 µg of KIS kinase and 2 mg of substrate; 100 mM ATP
stock solution was prepared by dissolving ATP powder 
(Sigma) in kinase buffer and adjusting the pH to 8.0. 



Aliquots were stored at -20°C. The reaction typically
required 24-48 h at 30°C to fully phosphorylate both
Ser80 and Ser82. The phosphorylation products were 
purified with a MonoQ ion exchange column (GE). The 
reaction products were buffer exchanged to MonoQ buffer 
consisting of 20 mM Tris-HCl (pH 7.2) and loaded onto 
the column. The NaCl concentration in the elution buffer 
was gradually increased over 40 column volumes from 0 to 
1 M, and 1-ml fractions were collected. The phosphorylation
products were separated depending on the phosphorylation
state. The double phosphorylation of Ser80
and Ser82 in phosphorylated SF1 NTD (pSF1 NTD ) and
SF1 (pSF1) was confirmed by sodium dodecyl sulphate-
polyacrylamide gel electrophoresis (SDS-PAGE), mass
spectrometry (Supplementary Figure S1) and, for isotope
labelled samples, by NMR spectroscopy (Figure 5). 

NMR spectroscopy 

All samples were measured at 298 K in NMR buffer with 
10% 2H2O added for the lock. NMR spectra were
recorded on AVIII 500, AVIII 600, AVIII 750, AVIII 
800 or AVI 900 Bruker NMR spectrometers, equipped 
with cryogenic triple resonance gradient probes (with the 
exception of the AVIII 750 MHz). Data were processed in 
NMRPipe/Draw (22) and analysed in Sparky 3 (T.D. 
Goddard and D.G. Kneller, University of California). 
Protein backbone assignments were obtained from 
HNCACB and HNCA spectra, and by comparison of 
1H,15N-HSQC and -TROSY spectra with previously
reported data (11,23). Amino acid side chain resonance
assignments were obtained from HCCH-TOCSY, 15N-
and 13C-edited NOESY-HSQC/HMQC experiments (24).
Intermolecular NOEs between the U2AF65 UHM and
SF1 NTD were identified for well-resolved peaks in the 3D
13C- and 15N-edited NOESY-HSQC experiments. HN-N,
N-C and HN-C residual dipolar couplings for SF1 HH
were recorded using HNCO-based NMR experiments
(24). Alignment media consisted of 15 mg/ml Pf1 phage
(Profos AG, Regensburg, Germany) (25). 15N R1, R1ρ
relaxation rates and {1H}-15N heteronuclear NOEs of
the SF1 NTD and U2AF65 UHM complexes were recorded
at 750 MHz, of SF1 NTD at 600 MHz and of SF1 at
800 MHz proton Larmor frequency at 298 K as described
(26). Local correlation times were derived from the 15N
R2/R1 ratio (26).
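As an illustration of how a local correlation time can be obtained from a 15N R2/R1 ratio, the sketch below uses a widely used slow-tumbling approximation, τc ≈ sqrt(6 R2/R1 − 7) / (4π νN). This is an assumption made for illustration; the exact procedure of reference (26) is not restated in this article, and R2 is assumed to have been derived beforehand from the measured R1ρ rates. The function names and the approximate gyromagnetic ratio are hypothetical choices.

```python
import math

# Approximate |gamma(15N)| / gamma(1H); used only to get the 15N Larmor frequency.
GAMMA_RATIO_N_TO_H = 0.10136


def nitrogen_larmor_hz(proton_mhz):
    """15N Larmor frequency in Hz for a given proton frequency in MHz."""
    return proton_mhz * 1e6 * GAMMA_RATIO_N_TO_H


def tau_c_from_rates(r1, r2, proton_mhz):
    """Estimate the correlation time (s) from 15N R1 and R2 (both in 1/s).

    Assumes a rigid, slowly tumbling amide (no fast internal motion,
    no chemical exchange contribution to R2).
    """
    ratio = r2 / r1
    argument = 6.0 * ratio - 7.0
    if argument <= 0.0:
        raise ValueError("R2/R1 is too small for the slow-tumbling approximation")
    return math.sqrt(argument) / (4.0 * math.pi * nitrogen_larmor_hz(proton_mhz))


def ratio_from_tau_c(tau_c, proton_mhz):
    """Inverse relation: the R2/R1 ratio implied by a given correlation time."""
    omega_n = 2.0 * math.pi * nitrogen_larmor_hz(proton_mhz)
    return (7.0 + 4.0 * (omega_n * tau_c) ** 2) / 6.0


if __name__ == "__main__":
    # Round trip for a 14 ns correlation time at 750 MHz proton frequency.
    ratio = ratio_from_tau_c(14e-9, 750.0)
    print(f"R2/R1 = {ratio:.1f}")
    print(f"tau_c = {tau_c_from_rates(1.0, ratio, 750.0) * 1e9:.2f} ns")
```

Residues in flexible regions violate the rigid-body assumption, so per-residue values from such a formula are best read as relative indicators of local mobility.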

Structure calculations 

The structure of SF1 HH was determined using standard 
approaches. Automated NOESY cross-peak assignments 
and structure calculations with torsion angle dynamics 
were performed with CYANA 3.0 (27). NOE-derived 
distance restraints derived from CYANA together with 
φ and ψ backbone dihedral angle restraints derived
from TALOS+ (28) based on secondary chemical shifts 
and residual dipolar couplings were used during the struc- 
ture calculation and for final water refinement (29). 
Structures were validated with iCing (http://nmr.cmbi.ru.nl/icing/).
Molecular images were generated with PyMOL
(Schrödinger).

The structure of the SF1 NTD -U2AF65 UHM complex 
was calculated using our previously reported protocol 
(30). The SF1 HH structure determined here and the struc- 
ture of the U2AF65 UHM /SF1 ULM complex (11) were used 
as input structures for semi-rigid calculation of the 
complex structure. The input structures were maintained 
using non-crystallographic symmetry restraints with a 
modified version of ARIA1.2/CNS (30). Distance restraints
derived from intermolecular NOEs and backbone torsion 
angle restraints derived from TALOS+ (28) were 
employed during molecular dynamics and simulated 
annealing. The final structures were refined in a shell of 
water molecules (29). 

Small-angle X-ray scattering 

SAXS data for solutions of SF1 NTD , pSF1 NTD , SF1, pSF1,
U2AF65 UHM , U2AF65 RRM123 , SF1 NTD -U2AF65 UHM ,
pSF1 NTD -U2AF65 UHM , SF1-U2AF65 RRM123 ,
pSF1-U2AF65 RRM123 , SF1-U2AF65 RRM123 -RNA and pSF1-
U2AF65 RRM123 -RNA were recorded at the X33 beamline
of the European Molecular Biology Laboratory at 
Deutsches Elektronen Synchrotron (DESY, Hamburg) 
using a MAR345 image plate detector. The scattering 
patterns were measured with a 2-min exposure time (eight 
frames, each 15 s) for several solute concentrations in the 
range from 1 to 10 mg/ml. Radiation damage was excluded
based on a comparison of individual frames of the 2-min 
exposures, where no changes were detected. Using the 
sample-detector distance of 2.7 m, a range of momentum 
transfer of 0.01 < s < 0.6 Å^-1 was covered (s = 4π sin(θ)/λ,
where 2θ is the scattering angle and λ = 1.5 Å is the X-ray
wavelength).
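The definition of the momentum transfer given above can be checked numerically. The following minimal sketch converts between the scattering angle 2θ and s for the stated wavelength of 1.5 Å; the function names are hypothetical and the code is not part of the data-collection software used at the beamline.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_angstrom=1.5):
    """s = 4*pi*sin(theta)/lambda, in 1/Angstrom, from the scattering angle 2*theta."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom


def scattering_angle_deg(s, wavelength_angstrom=1.5):
    """Inverse relation: scattering angle 2*theta (degrees) for a given s."""
    sin_theta = s * wavelength_angstrom / (4.0 * math.pi)
    return 2.0 * math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    # Angular range corresponding to the reported s range of 0.01 to 0.6 1/Angstrom.
    for s in (0.01, 0.6):
        print(f"s = {s} 1/A  ->  2theta = {scattering_angle_deg(s):.3f} deg")
```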

All SAXS data were analyzed with the package ATSAS.
The data were processed using standard procedures and
extrapolated to infinite dilution with the program
PRIMUS (31). The forward scattering, I(0), and the
radius of gyration, Rg, were evaluated using the Guinier
approximation. The values of I(0) and Rg as well as the
maximum dimension, Dmax, and the inter-atomic distance
distribution functions, P(R), were computed with the
program GNOM (32). The scattering from the high-resolution
models was computed with the program CRYSOL
(33). The masses of the solutes were evaluated by comparison
of the forward scattering intensity with that of a
bovine serum albumin reference solution (molecular
mass 66 kDa). For back calculation of SAXS data from
the NMR ensemble of the SF1 NTD -U2AF65 UHM 
complex, the flexible regions (SF1 NTD : 1-12, 26-38, 
70-95, 128-145) were randomized with CORAL (34). 
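The Guinier analysis and the mass estimate described above can be sketched as follows. This is a minimal illustration, not the PRIMUS implementation: it fits ln I(s) = ln I(0) − (s² Rg²)/3 by ordinary least squares on synthetic data and scales the concentration-normalised forward scattering to a 66-kDa reference. All function names and numerical inputs are hypothetical, and the usual restriction of the fit to small angles (s·Rg below roughly 1.3) is assumed to have been applied by the caller.

```python
import math


def guinier_fit(s_values, intensities):
    """Fit ln I versus s^2 with a straight line; return (I0, Rg).

    s_values are in 1/Angstrom, so Rg is returned in Angstrom.
    """
    x = [s * s for s in s_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("Guinier slope must be negative")
    return math.exp(intercept), math.sqrt(-3.0 * slope)


def mass_from_forward_scattering(i0, conc, i0_ref, conc_ref, ref_mass_kda=66.0):
    """Scale I(0)/c of the sample to that of a reference of known mass."""
    return ref_mass_kda * (i0 / conc) / (i0_ref / conc_ref)


if __name__ == "__main__":
    # Synthetic, noise-free Guinier curve for Rg = 30 Angstrom and I(0) = 100.
    true_rg, true_i0 = 30.0, 100.0
    s = [0.010 + 0.002 * k for k in range(16)]  # s*Rg stays below 1.3
    intensity = [true_i0 * math.exp(-(q * true_rg) ** 2 / 3.0) for q in s]
    i0, rg = guinier_fit(s, intensity)
    print(f"I(0) = {i0:.2f}, Rg = {rg:.2f} A")
```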



Isothermal titration calorimetry 

Calorimetric titrations were performed using an iTC200 
microcalorimeter (MicroCal) at 25°C. The buffer used
for the protein and ligand samples was 20 mM sodium
phosphate (pH 6.5) and 50 mM NaCl. The 200-µl
sample cell was filled with a 5 or 10 µM solution of
protein and the 40-µl injection syringe with a 50 or
100 µM solution of the titrating ligand. Each titration consisted
of a preliminary 0.2-µl injection followed by 20 subsequent
2-µl injections. The heat of the injections was corrected
for the heat of dilution of every ligand into
the buffer. At least two replicas were performed for each 
experiment. Binding thermodynamic models correspond- 
ing to bimolecular complex formation were fitted using 
routines provided by the manufacturer. 

Electrophoretic mobility shift assays 

The 3'-splice site RNA (GGUCAUACUAACCCUGUCCCUUUUUUUUCCACAG|C;
| denotes the 3'-splice site)
is derived from the 3'-splice site of AdML intron 1 by
replacing the original BPS with a consensus BPS
(underlined). The RNA was synthesized with the T7-
MEGAshortscript kit (Ambion) in the presence of [α-32P]
UTP and gel purified. RNA binding was performed for
15 min at room temperature in 10-µl reactions containing
U2AF65 RRM123 , SF1 proteins and 50 pmol [α-32P]UTP-
RNA. Assays with SF1 2-320 and deletion mutants were
performed in the presence of 12% (v/v) glycerol, 12 mM
Hepes-KOH (pH 7.9), 4 mM potassium-phosphate (pH
6.5), 60 mM KCl, 10 mM NaCl, 1.8 mM MgCl2, 0.5 mM
DTT, 0.12 mM EDTA, 5 µg tRNA and 8 U RiboShield™
ribonuclease inhibitor (Dundee Cell Products). Assays
with SF1 and pSF1 contained final concentrations of
8 mM potassium-phosphate (pH 6.5), 20 mM NaCl,
0.5 mM DTT, 5 µg tRNA and 8 U RiboShield™ ribonuclease
inhibitor. Reaction products were resolved in native
5% polyacrylamide gels (acrylamide:bisacrylamide = 80:1)
in 0.5× Tris/borate/EDTA (TBE) at 4°C for 3 h at 150 V.
RESULTS

The N-terminal region of SF1 adopts a novel helix-hairpin 
structure 

The 3D structures of the SF1 ULM and SF1 KHQUA2 regions 
bound to U2AF65 and intron RNA, respectively, have 
been reported previously (11,23). However, the region 
C-terminal of the SF1 ULM up to the KH-QUA2 domain
of SF1 has not been studied. A multiple sequence align- 
ment of SF1 shows that this region, which also harbours 
the tandem serine motif that is phosphorylated by KIS 
kinase (5), is highly conserved (Figure 1C). We therefore 
cloned and expressed recombinant proteins comprising 
residues 1-145 (SF1 NTD ) and 27-145 (SF1 HH ) for NMR 
analysis (Figure 1B). NMR data demonstrate that
SF1 NTD comprises a structured domain consisting of
two α-helical regions that are interrupted by a long
disordered linker (residues 69-95) (Supplementary Figures S2
and S3). As the free ULM region is intrinsically disordered 
(11), we determined the 3D structure of the region 
comprising residues 27-145 (SF1 HH ) based on distance, 
dihedral angle and residual dipolar coupling restraints 
(Figure 2, Supplementary Table S1). The structure was
further validated by comparison of measured and 
back-calculated relaxation rate enhancements upon 
addition of the paramagnetic co-solvent Gd(DTPA- 
BMA) (Supplementary Figure S3) (35). SF1 HH comprises 
a helix-hairpin motif with two α-helices in an anti-parallel
arrangement that are connected by a flexible linker
(Figure 2), which contains the SPSP tandem phosphorylation
motif. Hydrophobic residues within the two helices
are spaced every three to four residues, i.e. Ala51, Val54,
Ile58, Leu61 and Leu65 in α1 contact Leu105, Leu112,
Met116 and Leu119 in α2, respectively, and thereby stabilize
the arrangement of the two helices. Arg109 in helix
α2 is engaged in a potential salt bridge with Asp128
C-terminal of helix α2 (Figure 2B). The C-terminal
region adopts an extended conformation and packs
against the helix hairpin by hydrophobic interactions of
Phe123, Pro126 and Tyr129 with residues in both α-helices
(Ile58, Ile113 and Met116) (Figure 2B). In addition,
residues N-terminal of α1 (Ile40 and Leu44) form hydrophobic
contacts with residues located within α1 (Tyr52,
Ile53 and Leu56; Figure 2C). The additional interactions 
involving regions preceding the N- and C-terminal ends of 
the helix hairpin are confirmed by 15 N relaxation data 
(Supplementary Figure S2A), which show that these 
extensions are rigid and thus stably interact with the 
helix hairpin. Thus, SF1 HH comprises a helix-hairpin 
fold, which exposes the SPSP phosphorylation motif in a 
flexible linker. 



SF1 NTD provides a secondary interface with U2AF65 UHM 

To determine whether the helix hairpin contributes to the 
SF1-U2AF65 interaction, we determined binding 
affinities of SF1 and U2AF65 fragments using isothermal 
titration calorimetry (Table 1). As reported previously 
(11), U2AF65 UHM binds SF1 ULM with Kd = 127 ±
48 nM. Inclusion of the helix hairpin in SF1 (SF1 NTD )
shows a slightly increased affinity (Kd = 84 ± 24 nM),
indicating that this region contributes to the protein-
protein interaction. No further increase of binding
affinity is detected when studying the SF1-U2AF65 
complex using almost full-length U2AF65 
(U2AF65 RRM123 ) and a region comprising all structural 
domains in SF1 (residues 1-260) (Kd = 114 ± 24 nM).
We therefore conclude that SF1 NTD and U2AF65 UHM
harbour all relevant binding sites for the U2AF65-SF1 
interaction and represent a minimal complex. To identify 
the binding interface between SF1 NTD and U2AF65 UHM , 
we performed NMR chemical shift titrations of the 
isotope-labelled recombinant proteins with the unlabelled 
binding partner. The NMR signals of both SF1 NTD and 
U2AF65 UHM are shifted upon addition of unlabelled 
U2AF65 UHM or SF1 NTD , respectively (Figure 3A). Some 
of the NMR signals in the interface exhibit line broaden- 
ing, which is characteristic for medium- to high-affinity 
complexes with micromolar dissociation constants, and 
thus consistent with the ITC data. We observed large 
chemical shift perturbations (CSPs) of SF1 NTD and 
residues in U2AF65 UHM including numerous residues, 
which are not in contact with SF1 ULM in the previously
reported NMR structure (Figure 3B). In NMR relaxation 
measurements of the complex, both subunits show similar 
15N R1 and R1ρ relaxation rates (Supplementary
Figure S2C) and tumbling correlation times (τc,
Supplementary Figure S2D), thus indicating that they
tumble as a single entity. The average τc of 14 ns is in
good agreement with the correlation time expected for
the molecular mass of the complex (29 kDa;
τc,calculated = 1? ns) (37).
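Chemical shift perturbations of backbone amides, as discussed above, are commonly reported as a single combined value per residue. The sketch below shows one frequently used form, CSP = sqrt(ΔδH² + (w·ΔδN)²). The 15N weighting factor w = 0.2 and the function names are assumptions for illustration; the weighting actually applied in this study is not stated in the text.

```python
import math


def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.2):
    """Combined 1H/15N chemical shift perturbation in ppm.

    n_scale down-weights the 15N shift difference; 0.2 is an assumed,
    commonly used value and not taken from the article.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def residues_above_threshold(csps, threshold):
    """Return the residue labels whose combined CSP exceeds the threshold."""
    return [label for label, value in csps.items() if value > threshold]


if __name__ == "__main__":
    # Hypothetical shift differences (free minus bound), in ppm.
    shifts = {"V39": (0.08, 0.6), "I40": (0.05, 0.9), "A51": (0.01, 0.05)}
    csps = {res: combined_csp(dh, dn) for res, (dh, dn) in shifts.items()}
    for res, value in csps.items():
        print(f"{res}: {value:.3f} ppm")
    print("perturbed:", residues_above_threshold(csps, 0.1))
```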

To determine the 3D structure of the SF1 NTD - 
U2AF65 UHM complex, we employed inter- and 
intra-molecular distance restraints derived from 15 N- and 
13 C-edited NOESY spectra. To unambiguously identify 
intermolecular NOEs, a set of isotope-edited and 
-filtered NOESY spectra was recorded (24,38) and specif- 
ically isotope-labelled protein complexes were used where 
one of the subunits (either SF1 NTD or U2AF65 UHM ) was
deuterated and methyl-protonated. Several NOEs were
detected for 1H,13C-labelled methyl groups in the
binding interface (SF1 NTD V39, I40, I53, L56 and
U2AF65 UHM V458, V460; Supplementary Figure S4).



For structure calculation of the SF1 NTD -U2AF65 UHM 
complex, we used a protocol described recently (30). 
Briefly, semi-rigid body modelling was performed using
the previously determined structures of SF1 ULM -
U2AF65 UHM and SF1 HH as input. Comparison of secondary
chemical shifts between the free and bound proteins
and overall similarity of NMR spectra confirm that both 
binding partners do not undergo substantial structural 
changes upon complex formation compared with the 
input structure. Structures were calculated based on 
inter-molecular NOE distances and dihedral angle re- 
straints derived from TALOS+ (28). The final ensemble 
of structures was refined in a shell of water molecules (29) 
and validated by comparing experimental and back- 
calculated SAXS data of the complex (Supplementary 
Table S1).

Figure 2. Structure of a novel helix hairpin in the N-terminal domain of SF1. (A) Ribbon representation of the ensemble of the 10 lowest energy
structures and surface representation coloured according to electrostatic surface potential at 3 kT/e for positive (blue) or negative (red) charge
potential using the program APBS (36). Close-up view of the N-terminal (B) and C-terminal (C) structured extensions. Side chains of key residues
mediating the interactions between the α-helices and the N/C termini are shown as sticks.

Table 1. Binding affinities determined from isothermal titration
calorimetry

Sample                     Dissociation constant [Kd]
SF1 ULM -U2AF65 UHM        127 ± 48 nM
SF1 HH -U2AF65 UHM         n.d.
SF1 NTD -U2AF65 UHM        84 ± 23 nM
SF1-U2AF65 RRM123          114 ± 23 nM
pSF1-U2AF65 RRM123         96 ± 32 nM

The final ensemble of structures of the SF1 NTD -U2AF65 UHM
complex is shown in Figure 4A and
Supplementary Figure S4. SF1 NTD and U2AF65 UHM
form an additional hydrophobic interface involving 
U2AF65 UHM Met381, Val458, Val460 and SF1 NTD 
Val39, Ile40, Ile53, Leu56, which is further stabilized by 
a potential salt bridge between SF1 NTD Glu49 and 
U2AF65 UHM Lys462 (Figure 4B). This secondary inter- 
face buries an additional solvent-exposed surface of 
approximately 500 Å² [determined with PDBePISA (39)].
Notably, the additional interface in SF1 HH is remote from 
the ULM; it is located on the opposite side of the 
SF1 ULM -binding site of U2AF65 UHM and therefore does
not interfere with binding of SF1 ULM . The additional 
hydrophobic interface moderately strengthens the SF1-
U2AF65 interaction, with ITC-derived dissociation
constants of U2AF65 UHM for SF1 ULM and SF1 NTD of
Kd = 127 and 85 nM, respectively. Consistent with this
moderate contribution to the overall affinity, the interaction
of the helix hairpin (SF1 HH ) alone with
U2AF65 UHM is weak and not detectable by ITC (Kd >>
100 µM) (Table 1 and Supplementary Figure S5).
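To put the moderate affinity gain into energetic terms, dissociation constants can be converted to standard binding free energies with ΔG° = RT ln(Kd / 1 M). The sketch below applies this textbook relation to two of the Kd values in Table 1 at the ITC temperature of 25°C. The conversion is not part of the reported analysis, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from Kd (relative to 1 M)."""
    if kd_molar <= 0.0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    dg_ulm = binding_free_energy_kj(127e-9)  # SF1 ULM -U2AF65 UHM
    dg_ntd = binding_free_energy_kj(84e-9)   # SF1 NTD -U2AF65 UHM
    print(f"dG(ULM) = {dg_ulm:.2f} kJ/mol")
    print(f"dG(NTD) = {dg_ntd:.2f} kJ/mol")
    print(f"ddG     = {dg_ntd - dg_ulm:.2f} kJ/mol")
```

Given the reported uncertainties of the two Kd values, a difference of this size is small, in line with the description of the secondary interface as a moderate contribution.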

Tandem serine phosphorylation structures and rigidifies
the SF1 HH linker

Phosphorylation of two serine residues within the linker 
connecting the two α-helices in SF1 NTD has been reported
recently to enhance formation of the ternary SF1- 
U2AF65-RNA complex (5). To study potential structural 
effects linked to this observation, we prepared 
phosphorylated SF1 NTD (pSF1 NTD ) and SF1 (pSF1) by
in vitro phosphorylation with recombinant KIS kinase 
(Figure 5A). An overlay of NMR spectra comparing 
non-phosphorylated and phosphorylated proteins 
(Figure 5B) reveals large chemical shift changes linked 
to Ser80/Ser82 tandem phosphorylation. NMR chemical 
shifts were re-assigned using a set of standard triple resonance
NMR experiments (24). Most CSPs are observed
in close proximity to the phosphorylation sites at Ser80
and Ser82 and at the N-terminal end of helix α2
(Figure 5C). Strongly affected residues include many positively
charged residues in helix α2 such as Arg97 and
Arg100. 15N R1 and R1ρ relaxation rates and local
tumbling correlation times indicate that the linker connecting
helices α1 and α2 in SF1 HH becomes rigid upon
phosphorylation (Figure 5C). Taken together, these observations
suggest that the side chains of Arg93 (linker),
Arg97 and Arg100 (α2) mediate salt bridges with the
two phosphate groups in the SF1 HH linker and thereby
strongly reduce the conformational flexibility of this
linker.

Figure 3. NMR analysis of the SF1 NTD -U2AF65 UHM interaction. (A) Superposition of 1H,15N HSQC NMR spectra of labelled SF1 NTD and
U2AF65 UHM free (black) and when bound to unlabelled U2AF65 UHM (blue) or SF1 NTD (green), respectively. Selected residues, which are shifted
upon formation of the SF1 NTD -U2AF65 UHM complex, are annotated. (B) CSPs of amides linked to U2AF65 UHM (blue) and SF1 NTD (green) binding.
To selectively analyse the contributions of the secondary binding interface, the CSPs obtained in a titration of labelled SF1 ULM with unlabelled
U2AF65 UHM were subtracted from the SF1 NTD CSPs. Residues with strong CSPs that are therefore located in the secondary binding interface are
annotated.

Figure 4. Structure of the SF1 and U2AF65 complex. (A) Ribbon representation of the ensemble of the 10 lowest energy structures and
surface representation coloured according to electrostatic surface potential at 3 kT/e for positive (blue) or negative (red) charge potential using the
program APBS (36). (B) Close-up view of SF1 NTD -U2AF65 UHM complex interface. Side chains of key residues mediating the interactions between
the α-helices and the N/C-termini are shown as sticks.

Figure 5. NMR and SAXS analysis of the effect of tandem serine phosphorylation of SF1. (A) SDS-PAGE analysis of phosphorylation of SF1 NTD
and SF1. The phosphorylated protein (red arrow) migrates slower on the gel than the non-phosphorylated protein. (B) Superposition of HSQC NMR
spectra of non-phosphorylated (black) and phosphorylated SF1 NTD and SF1 (red). Selected residues, which are shifted upon phosphorylation, are
annotated. (C) CSPs (red) and residue-specific local correlation times τc are shown for SF1 NTD (blue) and pSF1 NTD (red). The secondary structures
and the phosphorylation sites are depicted above the diagram. The tandem phosphorylated linker region that rigidifies upon phosphorylation is
highlighted by a box. A ribbon representation of SF1 HH colour coded with the phosphorylation-induced CSPs is shown. (D) SAXS data showing
comparisons of radial density distributions of non-phosphorylated and phosphorylated SF1 NTD and SF1, in complex with U2AF65 UHM ,
U2AF65 RRM123 and with U2AF65 RRM123 -RNA, respectively.



Phosphorylation of SF1 does not significantly alter the 
overall conformation of SF1-U2AF65 

We next tested the structural impact of phosphorylation 
on SF1 as well as on the SF1 NTD -U2AF65 UHM and SF1-
U2AF65 RRM123 complexes using NMR and SAXS. CSPs
induced by phosphorylation are very similar for SF1 NTD 
and SF1 (cf. Figure 5C and Supplementary Figure S6) and 
are mainly localized within the SF1 helix-hairpin domain. 



Nucleic Acids Research, 2013, Vol. 41, No. 2 1351 



This indicates that phosphorylation does not induce 
strong intra-molecular contacts between the NTD and 
the KH-QUA2 regions of SF1. A comparison of local 
mobility along the protein backbone derived from 15 N 
NMR relaxation data of phosphorylated SF1 with 
the non-phosphorylated protein (Supplementary 
Figure S6A) reveals that the overall backbone dynamics 
of the protein do not change significantly upon 
phosphorylation. SAXS data further corroborate the NMR 
results. The radii of gyration (R g ) of SF1 and pSF1 are 
comparable (32.5 ± 0.3 versus 29.1 ± 0.3 Å) and only 
minor differences are seen in the pairwise distance 
distribution functions (P(r), Supplementary Figure S6B). 
Although these changes might indicate compaction 
of a minor fraction of 'open' species in the 
non-phosphorylated protein, the NMR relaxation data and 
the absence of phosphorylation-induced CSPs in 
SF1 KHQUA2 indicate that phosphorylation does not 
significantly affect the conformation of SF1. 

We next studied the impact of phosphorylation on 
the overall structure of the SF1 NTD -U2AF65 UHM , 
SF1-U2AF65 RRM123 and the ternary SF1-U2AF65 RRM123 -RNA 
complexes using SAXS analysis. Only minor 
differences are observed for the pairwise distribution 
functions of the two protein complexes (Figure 5D) and the 
derived radii of gyration (Table 2). In contrast, and similar 
to a recent report (40), binding of a 20-mer RNA containing 
the BPS and the Py tract regions induces large overall 
changes in the structure and/or dynamics of the 
SF1-U2AF65 RRM123 complex and leads to formation of a 
compact SF1-U2AF65 RRM123 -RNA arrangement, with 
R g = 39 ± 0.4 Å and 34.2 ± 0.1 Å in the absence and 
presence of the RNA ligand, respectively (Table 2). 
Notably, tandem serine phosphorylation of SF1 introduces 
only minor differences to the overall arrangement 
of the SF1-U2AF65 complexes (with or without RNA) 
(Figure 5D). This indicates that the effect of SF1 Ser80/Ser82 
phosphorylation is mainly limited to minor conformational 
changes within SF1, while RNA binding leads to a large 
change in the overall structure and/or dynamics of the 
U2AF65-SF1 complex. 
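
The comparison of radii of gyration can be made explicit with a short calculation. The following is a minimal sketch, not part of the original analysis: it uses the R g values listed in Table 2 (in nm), converts differences to Å, and combines the listed uncertainties in quadrature. The quadrature step assumes independent errors, which the text does not state; the names RG_NM, nm_to_angstrom and rg_change are hypothetical helpers introduced only for illustration.

```python
import math
from typing import Tuple

# R g values in nm as (value, uncertainty), transcribed from Table 2.
RG_NM = {
    "SF1 NTD": (2.24, 0.11),
    "pSF1 NTD": (2.18, 0.11),
    "SF1": (3.25, 0.16),
    "pSF1": (2.91, 0.09),
    "SF1 NTD-U2AF65 UHM": (2.48, 0.03),
    "pSF1 NTD-U2AF65 UHM": (2.32, 0.02),
    "SF1-U2AF65 RRM123": (3.94, 0.04),
    "pSF1-U2AF65 RRM123": (4.18, 0.04),
    "SF1-U2AF65 RRM123-RNA": (3.42, 0.01),
    "pSF1-U2AF65 RRM123-RNA": (3.36, 0.01),
}


def nm_to_angstrom(value_nm: float) -> float:
    """Convert nanometres to angstrom (1 nm = 10 angstrom)."""
    return value_nm * 10.0


def rg_change(before: str, after: str) -> Tuple[float, float]:
    """Return (R g difference in nm, combined uncertainty in nm).

    Assumption: the two uncertainties are independent, so they are
    combined in quadrature. This is an illustrative choice only.
    """
    rg_before, err_before = RG_NM[before]
    rg_after, err_after = RG_NM[after]
    return rg_after - rg_before, math.hypot(err_before, err_after)


if __name__ == "__main__":
    comparisons = [
        ("SF1", "pSF1"),                                   # phosphorylation, free SF1
        ("SF1-U2AF65 RRM123", "pSF1-U2AF65 RRM123"),       # phosphorylation, binary complex
        ("SF1-U2AF65 RRM123", "SF1-U2AF65 RRM123-RNA"),    # RNA binding
    ]
    for before, after in comparisons:
        delta, err = rg_change(before, after)
        print(f"{before} -> {after}: "
              f"{nm_to_angstrom(delta):+.1f} ± {nm_to_angstrom(err):.1f} Å")
```

Listing the differences side by side in this way is only a bookkeeping aid; interpretation of the changes still rests on the P(r) functions and the NMR data discussed above.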

Role of SF1 HH and phosphorylation for cooperative 
RNA binding 

Table 2. SAXS data and analysis 

Sample                      R g [nm]       D max [nm] 
SF1 NTD                     2.24 ± 0.11     7.8 
pSF1 NTD                    2.18 ± 0.11     7.6 
SF1                         3.25 ± 0.16    10.8 
pSF1                        2.91 ± 0.09    10.2 
SF1 NTD -U2AF65 UHM         2.48 ± 0.03     8.7 
pSF1 NTD -U2AF65 UHM        2.32 ± 0.02     7.6 
SF1-U2AF65 RRM123           3.94 ± 0.04    14.0 
pSF1-U2AF65 RRM123          4.18 ± 0.04    14.0 
SF1-U2AF65 RRM123 -RNA      3.42 ± 0.01    11.0 
pSF1-U2AF65 RRM123 -RNA     3.36 ± 0.01    11.0 

The role of the helix-hairpin structure and tandem serine 
phosphorylation of SF1 for cooperative binding of SF1 
and U2AF65 to the pre-mRNA was tested in electrophoretic 
mobility shift assays (EMSAs). Increasing concentrations 
of U2AF65 RRM123 added to a 3'-splice site RNA 
result in a smear of U2AF65-RNA complexes (Figure 6A). 
His 6 -tagged SF1 2-320 also binds the RNA, although weakly. 
In the presence of both proteins, a ternary SF1-U2AF65-RNA 
complex forms at lower U2AF65 concentrations, 
consistent with cooperative binding. The ternary complex 
is barely evident in the presence of SF1-ΔHH and is 
strongly reduced in the presence of SF1-Δα2. Deletion of 
helix α1 slightly increases ternary complex formation 
and deletion of the linker does not have any effect. Thus, 
in agreement with the data shown above, the helix-hairpin 
domain in the N-terminus of SF1 is necessary for cooperative 
RNA binding, with helix α2 playing a more important 
role than helix α1. In addition, phosphorylated SF1 
shows a slightly higher efficiency of ternary complex 
formation with U2AF65 RRM123 than non-phosphorylated 
SF1 (Figure 6B), consistent with the results of Manceau 
et al. (5). 



DISCUSSION 

Here, we show that the N-terminal region of SF1 
(SF1 NTD ) comprises a helix-hairpin fold. RNA-binding 
assays show that SF1 HH is essential for formation of the 
ternary SF1-U2AF65-RNA complex, as deletion of the 
helix hairpin abolishes cooperative RNA binding. 
Interestingly, deletion of either of the two α-helices does 
not abrogate formation of the ternary complex, although 
complex formation is reduced in the absence of helix α2 
(Figure 6A). This suggests that (i) SF1 HH mainly acts as a 
spacer between the U2AF65 UHM -bound SF1 ULM and the 
SF1 KHQUA2 region and may thus provide an optimal 
orientation of the proteins within the ternary complex 
(Figure 6C). Although deletion of both α-helices 
strongly shortens the distance between the ULM and 
KH-QUA2 regions in SF1, and therefore may not allow the 
proper arrangement of these regions needed for cooperative 
RNA binding, the presence of one of the two helices 
is able to rescue this effect. (ii) The observation that 
deletion of helix α2 reduced formation of the 
SF1-U2AF65-RNA complex may indicate that residues 
located in helix α2 are involved in stabilization of the 
ternary complex. Future structural studies of the ternary 
complex should clarify this point. 

Our NMR and structural studies reveal that 
U2AF65 UHM forms a secondary interface with SF1 NTD in 
addition to the previously reported interaction with 
SF1 ULM . The additional interface locks the relative orien- 
tation of U2AF65 UHM and SF1 NTD and is thus likely to be 
important for the specific quaternary arrangement of the 
proteins in the SF1-U2AF65 complex. In addition to 
providing a proper geometry of SF1 and U2AF65 for 
RNA binding, the secondary interface involving the 
SF1 HH and U2AF65 UHM may reduce the entropic costs 
linked to SF1-U2AF65-RNA complex formation by 
providing a prearranged protein-protein scaffold for 
RNA binding. Thus, although the secondary interface 
has a rather small contribution to the overall affinity 
Figure 6. Cooperative binding of U2AF65 RRM123 and SF1 to a 3'-splice site RNA. RNA was incubated with buffer or U2AF65 RRM123 (0.2, 0.5, 1 
and 2 µM; indicated by triangles) in the absence or presence of 1.5 µM His 6 -tagged SF1 2-320 or internal deletion mutants (A) or with 6.6 µM SF1 or 
pSF1 (B). Reaction products were separated by native PAGE and visualized by autoradiography. The migration of SF1-U2AF65-RNA complexes 
(closed arrowhead) and SF1-RNA complexes (open arrowhead) is indicated. (C) Role of SF1 HH and the SF1 NTD -U2AF65 UHM interaction in the 
formation of the 3'-splice site recognition complex. SF1 HH may establish an optimal orientation of the SF1 (KH-QUA2) and U2AF65 RRM1,2 
RNA-binding subunits in the complex and thereby support cooperative RNA binding. Tandem phosphorylation of SF1 might contribute to the 
formation of the ternary complex by stabilizing a yet unknown interface. 



between SF1 and U2AF65 (Table 1), it is an essential 
feature for formation of the ternary complex. A specific 
arrangement of SF1 and U2AF enforced by the helix 
hairpin of SF1 may be required to accommodate variations 
in the distance between the BPS and Py tract regions of 
introns by providing a prearranged scaffold of the SF1 
KH-QUA2 and U2AF65 RRM1,2 regions. Furthermore, 
the hydrophobic surface on U2AF65 UHM might mediate 
protein-protein interactions with additional 
SFs that could modulate complex assembly and splicing 
regulation. 

Our data show that tandem serine phosphorylation of 
the conserved SPSP site within the linker connecting 
SF1 NTD helices α1 and α2 has little effect on the 
conformation and overall structure of SF1 alone, or the 
SF1 NTD -U2AF65 UHM and the SF1-U2AF65 RRM123 complexes. 
NMR- and ITC-binding data also suggest that 
phosphorylation does not alter the protein-protein-binding 
affinities and interactions (Supplementary Figure S7 and 
Table 1) and thus further corroborate these results. 
Nevertheless, the EMSA data indicate that SF1 
phosphorylation enhances cooperative assembly of the 
SF1-U2AF65-RNA complex (Figure 6B), consistent with a 
previous report (5). As SAXS data do not indicate large 
conformational rearrangements linked to phosphorylation 
of SF1 alone or bound to U2AF65 (Figure 5D, 
Supplementary Figure S6B), the contribution to RNA 
binding must be small. The slight improvement in 
cooperative assembly could reflect either additional 
direct contacts with U2AF65 involving the 
serine-phosphorylated SF1 HH linker or an increase in the 
overall stability (and rigidity) of a prearranged 
SF1-U2AF65 complex, which would reduce the entropy loss 
linked to RNA binding. Structural studies to analyze these 
effects in the ternary complex are currently ongoing in our 
laboratory. 

Tandem serine phosphorylation of SF1 may have additional 
roles beyond a function in cooperative assembly of 
the constitutive splicing complex. Previous studies in 
mammalian and yeast systems have established that SF1 
mainly affects the kinetics of spliceosome assembly, as 
genetic or biochemical depletion of SF1 does not abolish 
splicing (41,42). In Saccharomyces cerevisiae, it has been 
shown that SF1 is involved in removing introns with 
suboptimal splice sites and in nuclear pre-mRNA retention 
(43,44), whereas knockdown of human SF1 in HeLa cells 
did not result in a general splicing phenotype (45). 
However, Corioni et al. (46) have recently shown that 
SF1 is not a constitutive SF, but is required for the 
splicing of certain introns and affects alternative splice 
site choice. In this respect, SF1 phosphorylation may 
mediate interactions with other factors that could 
regulate alternative splicing by modulating 3'-splice site 
recognition. For example, it has been suggested that the 
phosphorylated SPSP motif of SF1 is recognized by the 
tandem WW domains of FBP11 (47) and the interaction 
of the SF1 and FBP11 orthologues of S. cerevisiae has 
been implicated in mediating intron definition by connecting 
the 5'- and 3'-splice sites (48). However, we have not 
detected any interaction of tandem WW domains with 
phosphorylated SF1 (data not shown). Nevertheless, the 
substantial conformational changes in the SF1 NTD linker 
associated with tandem serine phosphorylation by KIS 
kinase suggest that interactions with as yet unknown 
factors are modulated, which may contribute to the 
regulation of alternative splicing by SF1. 






STRUCTURE COORDINATES AND NMR DATA 

The coordinates for the SF1 HH structure and the structure 
of the U2AF65 UHM /SF1 NTD complex are deposited in the 
PDB with accession numbers 2m09 and 2m0g, respect- 
ively. Chemical shifts are deposited in the BMRB, acces- 
sion codes: 18802 and 18808. 



SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Table 1, Supplementary Figures 1-7, 
Supplementary Methods and Supplementary References 
[36,49,50]. 